1. INTRODUCTION

The majority of rural India still depends on agriculture for its livelihood, and agriculture is the
biggest sector of the economy. Although India has made revolutionary progress in agriculture, there is
still scope for improving agricultural methods and enhancing crop yield using more scientific and
innovative approaches. A great deal of agricultural research is carried out every day, yet farmers still
resist applying modern techniques because of the difficulty of adapting to new approaches [1], [2]. If
these approaches are made easily accessible through cost- and time-efficient methods, more farmers are
likely to switch from traditional to modern techniques. Agriculture is not only the main sector of the
economy; it also provides food to people and raw materials to industries. The growing demand for food
further encourages the improvement of agricultural methodologies.

Many farmers still need better awareness of their soil in order to promote healthy crop growth and to
increase yield and income. Soil is the vital component of agriculture. Information about the soil, such
as its fertility, estimated yield, deficient components and the crops it can support, would help farmers
choose the right crops for their land. A farmer should know which crops the soil favours and which it
does not, so that unpredictable circumstances are avoided after sowing. If a farmer chooses crops by
considering his economic conditions, soil parameters and available facilities, he can expect healthy crop
growth, higher yield and better income. If farmers are guided in this direction using modern
technologies, their well-being is better assured and the country's growth is also supported.

Soil has several parameters, and the proper composition of each helps plant growth. The major components
are macronutrients such as nitrogen, phosphorus and potassium (NPK) and micronutrients such as zinc,
iron, manganese and magnesium. Other factors such as pH level, moisture content, temperature, electrical
conductivity and organic carbon, together with external factors such as climatic conditions, also
facilitate crop growth [3]-[6]. Techniques such as artificial intelligence can use soil parameter data to
predict the fertility rate and to recommend crops and nutrients in minimum time and at minimum cost. This
makes it convenient for the farmer to learn about his soil and choose a better crop. The suggested
nutrients or fertilizers would also help to enhance crop growth and yield [7]-[11].

Deep neural networks belong to deep learning, a sub-branch of machine learning, which is itself a branch
of artificial intelligence. A neural network is loosely modelled on the human brain and has the
capability to learn from data. A deep neural network has many layers. Each layer contains multiple
neurons that are connected to one another; neurons process the information and pass it on to the next
layer. The weights and the activation function determine the strength of the signal that is passed on.
Deep neural networks can solve various modern-day problems and produce accurate results [12]. Robustness,
parallel computation, self-learning ability and flexibility are the major advantages of neural networks.
Given a sufficient amount of data, a deep neural network performs classification and regression well,
which makes it a good solution for predicting fertility and suggesting the right crops and nutrients [8],
[12]-[15]. The subsequent sections of this paper discuss our solution for crop prediction, which is part
of our ‘soil fertility and crop friendliness detection and monitoring system’, and cover the
architecture, methodology and results of the proposed work. Crop prediction is a multi-class
classification problem in which a particular class is identified as the result. Of the many classifiers
available, we have chosen a few well-known ones to test and compare against the accuracy of our deep
neural network model.
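
To make the layer-by-layer flow of a neural network concrete, the following is a minimal NumPy sketch of a
forward pass through a small fully connected network. The layer sizes (9 inputs, two hidden layers of 16
neurons, 16 output classes), the random weights and the function names are illustrative assumptions, not
the configuration used in this work.

```python
import numpy as np

def relu(x):
    # Rectified linear unit: passes positive signals, zeroes out negative ones.
    return np.maximum(0.0, x)

def softmax(z):
    # Turns raw scores into probabilities; subtracting the row maximum avoids overflow.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def forward(x, layers):
    # Each layer is a (weights, bias) pair. Hidden layers use ReLU, the last one softmax.
    for weights, bias in layers[:-1]:
        x = relu(x @ weights + bias)
    weights, bias = layers[-1]
    return softmax(x @ weights + bias)

# Hypothetical sizes: 9 soil parameters in, 16 crop classes out.
rng = np.random.default_rng(0)
sizes = [9, 16, 16, 16]
layers = [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]

sample = rng.random((1, 9))  # one made-up, already normalized soil sample
probabilities = forward(sample, layers)
predicted_class = int(probabilities.argmax(axis=1)[0])
```

With untrained random weights the predicted class is meaningless; training adjusts the weights so that the
largest probability falls on the correct class.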


2. PROPOSED ARCHITECTURE 

The ‘soil fertility and crop friendliness detection and monitoring system’ is implemented using deep
neural networks and machine learning techniques to predict the fertility rate of the soil and to
recommend the right crops that can be grown in that soil. After suggesting the crops, the system also
recommends the fertilizers or nutrients needed for them. The proposed architecture consists of a soil
classifier, a comparison module, a prediction module, a crop recommender and a nutrition recommender, as
illustrated in Figure 1.

If the farmer knows which crops can be grown on his land, he can input them to the system; otherwise,
soil sensors can be used to capture soil parameters such as NPK, pH level, moisture, temperature,
electrical conductivity and organic carbon level. The input is then passed, along with the user input, to
the soil classifier module, which transfers control to the appropriate module. If the soil type is known,
the AI comparison module works with the pre-stored Nutrition dataset, which comprises soil parameters,
crops, and fertility rates or sensor readings, and predicts the fertility rate of the soil for the crop
named by the user. If the farmer has no idea about the type of soil, the classifier module transfers
control to the AI prediction module, which predicts the fertility rate and the best crop that can be
grown in his soil with the help of the Nutrition dataset. Crop prediction is particularly beneficial when
farmers migrate from one place to another or want to grow a new crop on their land. The architecture of
the prediction module is shown in Figure 2.
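
The dispatch performed by the soil classifier module can be sketched in a few lines of Python. The class
name, field names and return values below are hypothetical and chosen only for illustration; the sketch
captures the branching described above, not the actual implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SoilRequest:
    # Sensor readings keyed by parameter name, for example {"N": 42.0, "pH": 6.5}.
    readings: dict
    # Crop the farmer already has in mind; None when the farmer has no preference.
    known_crop: Optional[str] = None

def route(request: SoilRequest) -> str:
    # A known crop goes to the comparison module, which estimates fertility for that crop.
    # Everything else goes to the prediction module, which also proposes the best crop.
    if request.known_crop:
        return "comparison"
    return "prediction"

with_crop = route(SoilRequest({"N": 42.0, "pH": 6.5}, known_crop="paddy"))
without_crop = route(SoilRequest({"N": 42.0, "pH": 6.5}))
```

Here `with_crop` is routed to the comparison module and `without_crop` to the prediction module.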

The crop recommender module uses the crop dataset, which contains soil parameters and data on the crops
grown, to suggest four to five other crops that could be grown in the given soil sample. The nutrition
recommender module takes the output of the crop recommender and, with the help of the Fertilizer dataset,
suggests the nutrients or fertilizers required to enhance fertility for each suggested crop. The
fertility rate of the soil, the crop recommendations and the nutrient or fertilizer suggestions are
stored in the cloud and sent to farmers' phones as soil reports in an easily readable and understandable
format.

Figure 1. Architecture of the ‘Soil fertility & crop friendliness detection & monitoring system’

Figure 2. Architecture of AI prediction module


3. DATASET 

Data has been collected from some regions of the Mysore and Kodagu districts of Karnataka with the help
of an Indian Government website and soil testing sensors. The crops chosen are paddy, coffee, maize,
cowpea, red gram, banana, groundnut, areca nut, coconut, jowar, green gram, ragi, black gram, pepper,
cashew nut and sugarcane. There are 16 crops with 300 entries each, giving a total of 4800 rows of data.
Figure 3(a) shows the Nutrition dataset and its distribution. The soil parameters considered are nitrogen
(N), phosphorus (P), potassium (K), temperature, humidity, pH, rainfall, electrical conductivity and
organic carbon. The crop column is the target, since the crop label or class has to be predicted from the
soil parameter values. The scatterplot matrix in Figure 3(b) shows the relationships between the soil
parameters.

Figure 3. Dataset representation: (a) crop distribution in the collected Nutrition dataset and
(b) dataset scatterplot matrix
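
The dataset dimensions stated in this section can be checked with a short piece of arithmetic. The
variable names and the feature spellings below are assumptions made for illustration; the 20% hold-out
fraction is the one stated in Section 4.1.

```python
# Crops and soil parameters as listed in this section (spellings are illustrative).
CROPS = ["paddy", "coffee", "maize", "cowpea", "red gram", "banana", "groundnut",
         "areca nut", "coconut", "jowar", "green gram", "ragi", "black gram",
         "pepper", "cashew nut", "sugarcane"]
FEATURES = ["N", "P", "K", "temperature", "humidity", "pH", "rainfall",
            "electrical conductivity", "organic carbon"]
ROWS_PER_CROP = 300

total_rows = len(CROPS) * ROWS_PER_CROP   # 16 crops x 300 entries
test_rows = total_rows * 20 // 100        # 20% held out for testing and validation
train_rows = total_rows - test_rows       # remaining 80% used for training
```

This gives 4800 rows in total, of which 3840 are used for training and 960 are held out, which matches
the total support of 960 samples in the classification reports of Figure 6.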


4. METHODOLOGY & RESULTS 
4.1. Deep neural network model 

In this paper we focus on the AI prediction module, which predicts the best crop. We have used the Keras
deep learning library to build this module [16]. Keras models are flexible and easy to deploy. The Keras
sequential model is a simple network model; here it performs multi-class classification, assigning each
soil sample to a crop class and thereby predicting the right crop.

The model is implemented on the Google Colab platform. Libraries such as Scikit-learn, NumPy and Pandas
are imported together with the necessary classes and functions. The collected dataset of 4800 rows is
stored in CSV format and loaded into our Colab file. Data shuffling and normalization are performed as
initial steps to avoid bias and redundancy and to increase the accuracy of the model. The dataset is
split into 80% training data and 20% testing and validation data. Since the output is categorical, it is
encoded; the neural network and its layers are then defined. The input dimension is set to the number of
soil parameters used as input. The input layer and hidden layers use the ReLU activation function and the
output layer uses the softmax activation function. We have used the Adam optimizer for higher efficiency,
with categorical cross-entropy as the loss function. The Keras classifier is passed to the fit function
to train the model, which is run for 100, 150, 200 and 500 epochs; the loss and accuracy are obtained and
analysed, and prediction is then performed. The summary of results after 500 epochs is shown in
Figure 4(a), and a crop prediction instance with predicted values against the real values is shown in
Figure 4(b). The loss curve in Figure 5 shows that the model learns quickly, with a negligible difference
between the training and validation loss, which indicates that the model is a good fit. The Keras deep
learning model, implemented and run on the collected dataset, gives a classification accuracy of
approximately 87%.
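
The preprocessing steps described above (shuffling, normalization, label encoding and the 80/20 split)
can be sketched with NumPy alone. This is a minimal sketch under stated assumptions: min-max scaling is an
assumed choice, since the text does not say which normalization was used, and the function name `prepare`
and the random demonstration data are hypothetical.

```python
import numpy as np

def prepare(features, labels, test_fraction=0.2, seed=0):
    # Shuffle rows so that samples of the same crop are not grouped together.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    features, labels = features[order], labels[order]

    # Min-max normalization per column (assumed; maps every feature into [0, 1]).
    low, high = features.min(axis=0), features.max(axis=0)
    span = np.where(high > low, high - low, 1.0)
    features = (features - low) / span

    # One-hot encode the categorical crop labels.
    classes, indices = np.unique(labels, return_inverse=True)
    one_hot = np.eye(len(classes))[indices]

    # Hold out the last `test_fraction` of the shuffled rows.
    cut = len(features) - int(round(len(features) * test_fraction))
    return (features[:cut], one_hot[:cut]), (features[cut:], one_hot[cut:]), classes

# Made-up data: 100 samples with 9 soil parameters and 4 crop labels.
rng = np.random.default_rng(1)
X = rng.random((100, 9))
y = np.array(["paddy", "maize", "ragi", "coffee"])[rng.integers(0, 4, size=100)]
(train_X, train_y), (test_X, test_y), classes = prepare(X, y)
```

The sketch scales with statistics from the whole dataset for brevity; in practice the scaling statistics
would usually be computed on the training rows only and then applied to the held-out rows.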

Summary of the result after each epoch:

       loss      accuracy  val_loss  val_accuracy  epoch
495    0.197731  0.867708  0.200129  0.854167      495
496    0.198495  0.869531  0.223541  0.862500      496
497    0.193396  0.874479  0.218459  0.835417      497
498    0.195570  0.868490  0.222443  0.852083      498
499    0.197824  0.869531  0.213771  0.868750      499
(a)

[Table of sample crop predictions: predicted and actual crop labels (paddy, maize, pepper/coffee, ragi,
blackgram, arecanut) against the columns N, P, K, temperature, humidity, ph and rainfall. The cell values
are not recoverable from the extracted text.]
(b)

Figure 4. Results of our neural network model: (a) result summary of our model and (b) an instance of
crop prediction

Figure 5. Loss curve graph of our model


4.2. Other machine learning classifiers
4.2.1. K-nearest neighbor (KNN) 

KNN is one of the simplest algorithms for classifying multi-class datasets and can also be used for crop
recommendation [17]-[20]. It assumes that a new data point is similar to the available data and assigns
it to the category of the available points it most resembles. The value of K has to be selected first;
the distance between the new point and the existing data points is then computed using a measure such as
Euclidean distance, and the K nearest neighbours determine the class. The KNN classifier yields an
accuracy of 64.7% for our dataset.
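
The KNN procedure (choose K, compute Euclidean distances, take a majority vote) can be sketched as follows.
The function name `knn_predict`, the value K = 3 and the two-feature toy data are assumptions for
illustration and are unrelated to the collected dataset.

```python
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    # Euclidean distance from the query point to every training point.
    distances = np.sqrt(((train_X - query) ** 2).sum(axis=1))
    # Indices of the k closest training points.
    nearest = np.argsort(distances)[:k]
    # Majority vote among the labels of those neighbours.
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Made-up, already scaled two-feature samples for two crops.
train_X = np.array([[0.10, 0.20], [0.20, 0.10], [0.15, 0.15],
                    [0.80, 0.90], [0.90, 0.80], [0.85, 0.85]])
train_y = np.array(["ragi", "ragi", "ragi", "paddy", "paddy", "paddy"])

prediction = knn_predict(train_X, train_y, np.array([0.82, 0.88]), k=3)
```

The query point lies next to the three "paddy" samples, so the vote returns "paddy".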


4.2.2. Decision tree classifier 

The decision tree classifier is a well-known ML classifier that can also be used for predicting crops
[21], [22]. A tree is constructed using recursive binary splitting: the data is split repeatedly until
the smallest subsets are obtained. When new data is to be categorized, a sequence of hierarchically
arranged tests is performed to obtain the class label. The decision tree classifier yields an accuracy of
70% for our dataset.


4.2.3. Support vector machine (SVM) 

SVM is well suited to multi-class classification for predicting crops [17], [20], [23]. The SVM algorithm
creates a hyperplane by choosing the vector points (support vectors) that give the best decision
boundary; this hyperplane is then used to assign data points to the category to which they belong. SVM is
an efficient algorithm that works well in high-dimensional spaces. The support vector classifier yields
an accuracy of 71.56% for our dataset.


4.2.4. Gaussian NB 
Naive Bayes (NB) is a well-known ML classification method used for crop recommendation [17], [18], [22],
[23]. Gaussian NB is a variant of Naive Bayes in which the features are assumed to follow a normal
distribution. Naive Bayes works by applying Bayes' theorem to calculate conditional probabilities.
Gaussian NB is a fast algorithm for categorizing high-dimensional data. The Gaussian NB classifier yields
an accuracy of 72% for our dataset.
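
The way Gaussian NB combines Bayes' theorem with the normal-distribution assumption can be sketched in
NumPy. The function names, the small variance floor of 1e-9 and the toy data are assumptions for
illustration; this is not the classifier configuration used for the reported result.

```python
import numpy as np

def fit_gaussian_nb(X, y):
    # Per class: prior probability, feature means and feature variances.
    model = {}
    for label in np.unique(y):
        rows = X[y == label]
        # The small constant keeps the variance strictly positive.
        model[label] = (len(rows) / len(X), rows.mean(axis=0), rows.var(axis=0) + 1e-9)
    return model

def predict_gaussian_nb(model, x):
    # Pick the class with the highest log posterior: log prior plus the sum of
    # per-feature Gaussian log likelihoods (the "naive" independence assumption).
    scores = {}
    for label, (prior, mean, var) in model.items():
        log_likelihood = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)
        scores[label] = np.log(prior) + log_likelihood
    return max(scores, key=scores.get)

# Made-up, already scaled two-feature samples for two crops.
nb_X = np.array([[0.10, 0.20], [0.20, 0.10], [0.15, 0.15],
                 [0.80, 0.90], [0.90, 0.80], [0.85, 0.85]])
nb_y = np.array(["ragi", "ragi", "ragi", "paddy", "paddy", "paddy"])

nb_model = fit_gaussian_nb(nb_X, nb_y)
nb_prediction = predict_gaussian_nb(nb_model, np.array([0.80, 0.90]))
```

Working with log probabilities avoids numerical underflow when many features are multiplied together.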


4.2.5. Linear discriminant analysis (LDA) 

Linear discriminant analysis can be used for well-known classification problems in the agricultural
domain [24], [25]. LDA is used for classification as well as dimensionality reduction. It is a simple
classification algorithm that finds a linear combination of features to separate two or more classes,
which makes it a preferred linear classification technique. LDA separates the classes by projecting the
data from a higher-dimensional space onto a lower-dimensional one. The linear discriminant analysis
classifier yields an accuracy of 60.9% for our dataset. The confusion matrices and classification reports
of the implemented classifiers are shown in Figures 6 and 7. The comparison of the accuracies of the deep
neural network and the various classifiers is shown in Figure 8.
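
The precision, recall, F1-score and accuracy values that accompany a confusion matrix can be derived from
the matrix itself. The sketch below assumes that rows are true classes and columns are predicted classes
(the scikit-learn convention); the function name and the 3-class example matrix are made up for
illustration.

```python
import numpy as np

def per_class_metrics(confusion):
    # Rows are true classes, columns are predicted classes.
    confusion = np.asarray(confusion, dtype=float)
    true_positive = np.diag(confusion)
    predicted_total = confusion.sum(axis=0)  # column sums
    actual_total = confusion.sum(axis=1)     # row sums (the "support")

    # Classes that are never predicted (or never occur) get a score of 0.
    precision = np.divide(true_positive, predicted_total,
                          out=np.zeros_like(true_positive), where=predicted_total > 0)
    recall = np.divide(true_positive, actual_total,
                       out=np.zeros_like(true_positive), where=actual_total > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(denom), where=denom > 0)
    accuracy = true_positive.sum() / confusion.sum()
    return precision, recall, f1, accuracy

# Made-up 3-class confusion matrix.
example = [[8, 2, 0],
           [1, 9, 0],
           [0, 0, 10]]
precision, recall, f1, accuracy = per_class_metrics(example)
```

For this example the recall values are 0.8, 0.9 and 1.0 and the overall accuracy is 27/30 = 0.9.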


Support vector's accuracy: 0.715625
Gaussian NB's accuracy: 0.7208333333333333

[Confusion matrices of the two classifiers: the cell values are not recoverable from the extracted text.]

Classification report of the SVM classifier ("?" marks a value or digit that is illegible in the source)

              precision  recall  f1-score  support
arecanut      0.48       0.84    0.61      62
banana        0.97       0.99    0.98      76
blackgram     0.84       0.82    0.83      50
cashewnut     0.00       0.00    0.00      64
coconut       ?          ?       ?         62
coffee        ?          ?       ?         48
cowpea        1.00       0.96    0.98      52
greengram     0.81       0.70    0.75      63
groundnut     1.00       1.00    1.00      56
jowar         0.44       1.00    0.61      53
maize         0.67       0.76    0.71      54
paddy         0.83       1.00    0.91      58
pepper        0.00       0.00    0.00      66
ragi          0.82       0.84    0.83      63
redgram       0.92       0.91    0.92      67
sugarcane     0.00       0.00    0.00      66
accuracy                         0.72      960
macro avg     0.62       0.73    0.66      960
weighted avg  0.62       0.72    0.65      960

Classification report of the Gaussian NB classifier ("?" marks a value or digit that is illegible in the source)

              precision  recall  f1-score  support
arecanut      ?          ?       ?         62
banana        ?          ?       ?         76
blackgram     0.76       0.6?    0.70      50
cashewnut     0.38       0.17    0.24      64
coconut       0.95       1.00    0.98      62
coffee        0.35       0.62    0.45      48
cowpea        1.00       1.00    1.00      52
greengram     0.95       0.94    0.94      63
groundnut     1.00       1.00    1.00      56
jowar         0.38       0.57    0.46      53
maize         0.94       0.94    0.94      54
paddy         1.00       1.00    1.00      58
pepper        0.38       0.17    0.23      66
ragi          0.82       0.63    0.71      63
redgram       0.71       0.96    0.82      67
sugarcane     0.44       0.27    0.34      66
accuracy                         0.72      960
macro avg     0.72       0.72    0.71      960
weighted avg  0.72       0.72    0.71      960


Figure 6. Classification report of SVM & Gaussian NB classifiers

Figure 7. Classification report of KNN, decision tree and LDA classifiers

Figure 8. Accuracy comparison of all the implemented ML classifiers and deep learning model


5. CONCLUSION

We have implemented a deep neural network and several machine learning algorithms as classifiers for our
collected Nutrition dataset and obtained the accuracies of our neural network model and the other
classifiers. Our deep learning model has approximately 87% accuracy, and we obtained 71.56%, 72%, 64.7%,
70% and 60.9% accuracy for SVM, Gaussian NB, KNN, decision tree and LDA respectively. The deep neural
network therefore has the highest accuracy among all our algorithms and models, with the added advantages
of self-learning ability, robustness and flexibility. A classification accuracy of 87% is a good result
for predicting the right crops and helping farmers choose the most appropriate crop for their land based
on the soil properties.